Value-Decomposition Multi-Agent Actor-Critics

Abstract

The exploitation of extra state information has been an active research area in multi-agent reinforcement learning (MARL). QMIX represents the joint action-value using a non-negative function approximator and achieves the best performance on the StarCraft II micromanagement testbed, a common MARL benchmark. However, our experiments demonstrate that, in some cases, QMIX performs sub-optimally with the A2C framework, a training paradigm that promotes algorithm training efficiency. To obtain a reasonable trade-off between training efficiency and algorithm performance, we extend value-decomposition to actor-critic methods that are compatible with A2C and propose a novel actor-critic framework, value-decomposition actor-critic (VDAC). We evaluate VDAC on the StarCraft II micromanagement task and demonstrate that the proposed framework improves median performance over other actor-critic methods. Furthermore, we use a set of ablation experiments to identify the key factors that contribute to the performance of VDAC.
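As a rough illustration of the value-decomposition idea mentioned in the abstract, the sketch below combines per-agent values into a joint value in two ways: a plain sum, and a linear mix whose weights are forced to be non-negative. It is not the paper's implementation. The function names `additive_mix` and `monotonic_mix` are hypothetical, and the fixed weight list is an assumption standing in for the state-conditioned mixing network that a method such as QMIX would learn.

```python
from typing import Sequence


def additive_mix(local_values: Sequence[float]) -> float:
    """Joint value as the plain sum of per-agent values."""
    return sum(local_values)


def monotonic_mix(
    local_values: Sequence[float],
    raw_weights: Sequence[float],
    bias: float = 0.0,
) -> float:
    """Joint value as a linear mix with non-negative weights.

    Taking abs() of each raw weight guarantees that increasing any
    single agent's value can never decrease the joint value.
    (Assumption: fixed weights replace a learned mixing network.)
    """
    if len(local_values) != len(raw_weights):
        raise ValueError("need exactly one weight per agent")
    return sum(abs(w) * v for w, v in zip(raw_weights, local_values)) + bias


if __name__ == "__main__":
    values = [1.0, 2.0]
    weights = [-0.5, 2.0]  # the negative raw weight is made non-negative
    print(additive_mix(values))
    print(monotonic_mix(values, weights, bias=1.0))
```

Because every effective weight is non-negative, each agent can improve the joint value by improving its own local value, which is the property that lets decentralized agents act greedily on their own estimates.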

Similar resources

Revisiting Natural Actor-Critics with Value Function Approximation

Actor-critic architectures have become popular during the last decade in the field of reinforcement learning because of the introduction of the policy gradient with function approximation theorem. It allows actor-critic architectures to be combined with value function approximation in a principled way, and therefore makes large-scale problems addressable. Recent research led to the replacement of policy gradient b...

Value-Decomposition Networks For Cooperative Multi-Agent Learning

We study the problem of cooperative multi-agent reinforcement learning with a single joint reward signal. This class of learning problems is difficult because of the often large combined action and observation spaces. In the fully centralized and decentralized approaches, we find the problem of spurious rewards and a phenomenon we call the “lazy agent” problem, which arises due to partial obser...

Task Coordination and Decomposition in Multi-Actor Planning Systems

We discuss a framework for coordinating self-interested agents that can be used to decompose a multi-agent task-based planning problem into independent subproblems. This problem decomposition can be achieved by a simple protocol and allows the agents to solve their part of the problem without the need to interact with other agents and in such a way that the resulting plans can be seamlessly int...

Modeling emotions in learning multi-agent systems

This thesis examines the positive or negative role of emotions in the performance of learning agents in a multi-agent environment. To this end, a model for learning agents with emotions is introduced. To examine the role of emotions, a hypothetical multi-agent environment is simulated and several scenarios are considered within it. In the first scenario, the performance of agents that have no emotions and are only capable of learning is examined. In the second scenario...

Exploring multi-actor value creation in IT service processes

Organizational information technology (IT) needs are served through increasingly complex configurations of people, technologies, organizations, and shared information. Ideally, an organizational IT service is valuable for both the providers and users of systems and solutions. However, mutually beneficial outcomes may be difficult to achieve within the configurations through which IT services ar...

Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2021

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v35i13.17353